

On Defining Neural Averaging

Lee, Su Hyeong, Ngo, Richard

arXiv.org Artificial Intelligence

What does it even mean to average neural networks? We investigate the problem of synthesizing a single neural network from a collection of pretrained models, each trained on disjoint data shards, using only their final weights and no access to training data. In forming a definition of neural averaging, we draw insight from model soup, which appears to aggregate multiple models into a singular model while enhancing generalization performance. In this work, we reinterpret model souping as a special case of a broader framework: Amortized Model Ensembling (AME) for neural averaging, a data-free meta-optimization approach that treats model differences as pseudogradients to guide neural weight updates. We show that this perspective not only recovers model soup but enables more expressive and adaptive ensembling strategies. Empirically, AME produces averaged neural solutions that outperform both individual experts and model soup baselines, especially in out-of-distribution settings. Our results suggest a principled and generalizable notion of data-free model weight aggregation and define, in one sense, how to perform neural averaging.
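The core idea of the abstract, treating model differences as pseudogradients, can be illustrated with a minimal sketch. This is an assumption-laden toy (flattened weight vectors, a plain-average pseudogradient, constant step size), not the paper's actual algorithm; the point is that vanilla iteration converges to the soup (the mean), while swapping in momentum or adaptive steps would give the more expressive variants the abstract alludes to.

```python
import numpy as np

def amortized_ensemble(expert_weights, lr=0.5, steps=10):
    """Data-free aggregation sketch: the mean displacement from the
    current iterate to the experts acts as a pseudogradient, and we
    take SGD-style steps along it. Hypothetical illustration only."""
    theta = np.array(expert_weights[0], dtype=float)  # start from one expert
    for _ in range(steps):
        pseudograd = np.mean([theta - w for w in expert_weights], axis=0)
        theta = theta - lr * pseudograd
    return theta

# Three toy "experts" (flattened weight vectors from disjoint shards)
experts = [np.array([1.0, 2.0]), np.array([3.0, 0.0]), np.array([2.0, 4.0])]
theta = amortized_ensemble(experts)
```

With this simplest choice of pseudogradient and step rule, the iterate contracts geometrically toward the uniform average, i.e. the model-soup solution is the fixed point.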



Dual-Domain Deep Learning-Assisted NOMA-CSK Systems for Secure and Efficient Vehicular Communications

Huang, Tingting, Chen, Jundong, Zeng, Huanqiang, Cai, Guofa, Kaddoum, Georges

arXiv.org Artificial Intelligence

Ensuring secure and efficient multi-user (MU) transmission is critical for vehicular communication systems. Chaos-based modulation schemes have garnered considerable interest due to their benefits in physical layer security. However, most existing MU chaotic communication systems, particularly those based on non-coherent detection, suffer from low spectral efficiency due to reference signal transmission, and limited user connectivity under orthogonal multiple access (OMA). While non-orthogonal schemes, such as sparse code multiple access (SCMA)-based DCSK, have been explored, they face high computational complexity and inflexible scalability due to their fixed codebook designs. This paper proposes a deep learning-assisted power domain non-orthogonal multiple access chaos shift keying (DL-NOMA-CSK) system for vehicular communications. A deep neural network (DNN)-based demodulator is designed to learn intrinsic chaotic signal characteristics during offline training, thereby eliminating the need for chaotic synchronization or reference signal transmission. The demodulator employs a dual-domain feature extraction architecture that jointly processes the time-domain and frequency-domain information of chaotic signals, enhancing feature learning under dynamic channels. The DNN is integrated into the successive interference cancellation (SIC) framework to mitigate error propagation issues. Theoretical analysis and extensive simulations demonstrate that the proposed system achieves superior performance in terms of spectral efficiency (SE), energy efficiency (EE), bit error rate (BER), security, and robustness, while maintaining lower computational complexity compared to traditional MU-DCSK and existing DL-aided schemes. These advantages validate its practical viability for secure vehicular communications.
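The dual-domain feature extraction the abstract describes, jointly processing time-domain and frequency-domain information, can be sketched generically. The function name and the concatenation layout are assumptions for illustration; the paper's actual DNN architecture is not specified here.

```python
import numpy as np

def dual_domain_features(x):
    """Concatenate raw time-domain samples with FFT magnitudes so a
    downstream DNN demodulator sees both representations of the
    received chaotic signal. Generic sketch, not the paper's network."""
    freq = np.abs(np.fft.rfft(x)) / len(x)   # frequency-domain magnitudes
    return np.concatenate([x, freq])

# Stand-in for one received chip sequence (a real system would use
# a chaotic carrier plus channel effects)
chip = np.sin(2 * np.pi * 0.1 * np.arange(64))
feat = dual_domain_features(chip)
```

A 64-sample input yields 64 time features plus 33 one-sided spectrum magnitudes, and the combined vector would feed the demodulator inside the SIC loop.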


From Dialect Gaps to Identity Maps: Tackling Variability in Speaker Verification

Abdullah, Abdulhady Abas, Badawi, Soran, Abdullah, Dana A., Hamad, Dana Rasul

arXiv.org Artificial Intelligence

This work investigates the complexity and difficulty of Kurdish speaker identification across the language's several dialects. With its substantial phonetic and lexical differences, Kurdish, spanning dialects including Kurmanji, Sorani, and Hawrami, poses special challenges for speaker recognition systems. We examine the main difficulties in building a robust speaker identification system capable of accurately identifying speakers across dialects, and, to raise the accuracy and dependability of such systems, propose solutions including sophisticated machine learning approaches, data augmentation tactics, and the construction of comprehensive dialect-specific corpora. The results show that strategies customized for each dialect, together with cross-dialect training, greatly enhance recognition performance.
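One common data augmentation tactic for low-resource speech corpora of the kind the abstract mentions is speed perturbation, resampling each utterance slightly faster and slower to multiply the training data. This is a generic example, not necessarily the tactic the authors used; a minimal linear-interpolation version:

```python
import numpy as np

def speed_perturb(wave, factor):
    """Time-stretch a waveform by linear-interpolation resampling.
    factor > 1 shortens (speeds up), factor < 1 lengthens (slows down)."""
    n_out = int(round(len(wave) / factor))
    t_new = np.linspace(0, len(wave) - 1, n_out)
    return np.interp(t_new, np.arange(len(wave)), wave)

wave = np.random.randn(16000)      # stand-in for 1 s of speech at 16 kHz
fast = speed_perturb(wave, 1.1)    # ~10% faster copy
slow = speed_perturb(wave, 0.9)    # ~10% slower copy
```

Production systems typically use higher-quality resamplers (e.g. polyphase filtering), but the augmentation principle is the same: each dialect's scarce recordings yield several acoustically varied training copies.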


Adversarial Defense in Cybersecurity: A Systematic Review of GANs for Threat Detection and Mitigation

Ndayipfukamiye, Tharcisse, Ding, Jianguo, Sarwatt, Doreen Sebastian, Philipo, Adamu Gaston, Ning, Huansheng

arXiv.org Artificial Intelligence

The digital transformation of modern society has expanded the attack surface of critical infrastructures, enterprise networks, and personal devices. The rapid propagation of cyber threats, driven by sophisticated adversarial attacks including evasion [8, 82], data poisoning [21], and backdoor insertion [20, 21], has weakened traditional security measures across domains including intrusion detection systems (IDS), Internet of Things (IoT) security, and autonomous networks [19, 82, 127, 138]. These attacks exploit machine learning vulnerabilities, vastly expanding attack surfaces amid the proliferation of IoT devices and distributed systems [35, 58, 59]. Generative Adversarial Networks (GANs), first introduced by Goodfellow et al. [1], have transitioned from synthetic data generation to essential defenses, enabling adversarial scenario simulation, dataset augmentation, and model resilience enhancement [16, 32, 33, 139]. Variants such as Conditional GANs (CGANs) and Wasserstein GANs (WGANs) excel at producing realistic samples for anomaly detection and IDS robustness [27, 169, 170], outperforming static signature-based approaches against dynamic threats [60, 169, 173]. Yet GAN applications in cybersecurity remain fragmented, grappling with training instability, dataset scarcity, edge-device computational constraints, and dual-use risks where GANs facilitate both defenses and advanced attacks [11, 13, 24, 34, 44, 61-63, 79, 80]. Recent advances, such as GAN-IF models for intrusion detection and AR-GAN for autonomous vehicle defenses, underscore their potential in real-time mitigation, but ethical frameworks and unified evaluations remain deficient [78, 81]. This gap necessitates a systematic literature review (SLR) to consolidate GAN architectures, applications, and performance metrics for proactive adversarial defense.


Translation from Wearable PPG to 12-Lead ECG

Ji, Hui, Gao, Wei, Zhou, Pengfei

arXiv.org Artificial Intelligence

The 12-lead electrocardiogram (ECG) is the gold standard for cardiovascular monitoring, offering superior diagnostic granularity and specificity compared to photoplethysmography (PPG). However, existing 12-lead ECG systems rely on cumbersome multi-electrode setups, limiting sustained monitoring in ambulatory settings, while current PPG-based methods fail to reconstruct multi-lead ECG due to the absence of inter-lead constraints and insufficient modeling of spatial-temporal dependencies across leads. To bridge this gap, we introduce P2Es, an innovative demographic-aware diffusion framework designed to generate clinically valid 12-lead ECG from PPG signals via three key innovations. Specifically, in the forward process, we introduce frequency-domain blurring followed by temporal noise interference to simulate real-world signal distortions. In the reverse process, we design a temporal multi-scale generation module followed by frequency deblurring. In particular, we leverage KNN-based clustering combined with contrastive learning to assign affinity matrices for the reverse process, enabling demographic-specific ECG translation. Extensive experimental results show that P2Es outperforms baseline models in 12-lead ECG reconstruction.
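The forward corruption process the abstract outlines, frequency-domain blurring followed by temporal noise interference, can be sketched in one step. The Gaussian spectral envelope and the parameter names are assumptions for illustration; the paper's actual diffusion schedule is not reproduced here.

```python
import numpy as np

def forward_corrupt(ecg, sigma_f=2.0, sigma_t=0.1, rng=None):
    """One forward-process step in the spirit of the abstract:
    Gaussian low-pass blur in the frequency domain, then additive
    noise in the time domain. Illustrative sketch only."""
    rng = np.random.default_rng(rng)
    spec = np.fft.rfft(ecg)
    freqs = np.arange(len(spec))
    spec *= np.exp(-0.5 * (freqs / sigma_f) ** 2)   # frequency blurring
    blurred = np.fft.irfft(spec, n=len(ecg))
    return blurred + sigma_t * rng.standard_normal(len(ecg))  # temporal noise

x = np.sin(2 * np.pi * 5 * np.linspace(0, 1, 256))  # stand-in ECG lead
x_noisy = forward_corrupt(x, rng=0)
```

The reverse process would then learn to undo both distortions, deblurring in frequency after multi-scale generation in time, conditioned on the PPG input and a demographic affinity matrix.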


QualityFM: a Multimodal Physiological Signal Foundation Model with Self-Distillation for Signal Quality Challenges in Critically Ill Patients

Guo, Zongheng, Chen, Tao, Ferrario, Manuela

arXiv.org Artificial Intelligence

Photoplethysmogram (PPG) and electrocardiogram (ECG) signals are commonly recorded in the intensive care unit (ICU) and operating room (OR). However, the high incidence of poor, incomplete, and inconsistent signal quality can lead to false alarms or diagnostic inaccuracies. The methods explored so far suffer from limited generalizability, reliance on extensive labeled data, and poor cross-task transferability. To overcome these challenges, we introduce QualityFM, a novel multimodal foundation model for these physiological signals, designed to acquire a general-purpose understanding of signal quality. Our model is pre-trained on a large-scale dataset comprising over 21 million 30-second waveforms and 179,757 hours of data. Our approach involves a dual-track architecture that processes paired physiological signals of differing quality, leveraging a self-distillation strategy where an encoder for high-quality signals is used to guide the training of an encoder for low-quality signals. To efficiently handle long sequential signals and capture essential local quasi-periodic patterns, we integrate a windowed sparse attention mechanism within our Transformer-based model. Furthermore, a composite loss function, which combines direct distillation loss on encoder outputs with indirect reconstruction loss based on power and phase spectra, ensures the preservation of frequency-domain characteristics of the signals. We pre-train three models with varying parameter counts (9.6 M to 319 M) and demonstrate their efficacy and practical value through transfer learning on three distinct clinical tasks: false alarm of ventricular tachycardia detection, the identification of atrial fibrillation, and the estimation of arterial blood pressure (ABP) from PPG and ECG signals.
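The windowed sparse attention mechanism mentioned above restricts each position to a local neighborhood, which is what makes long waveform sequences tractable. A minimal sketch of the attention mask (the window size and masking convention are illustrative assumptions, not the paper's configuration):

```python
import numpy as np

def windowed_attention_mask(seq_len, window):
    """Boolean mask where position i may attend only to positions j
    with |i - j| <= window, reducing attention cost from O(n^2)
    to roughly O(n * window)."""
    idx = np.arange(seq_len)
    return np.abs(idx[:, None] - idx[None, :]) <= window

mask = windowed_attention_mask(6, 1)  # each token sees itself and 1 neighbor each side
```

In a Transformer this mask would zero out (or set to -inf before softmax) all attention logits outside the band, so each 30-second waveform token attends only to its local quasi-periodic context.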


Hide-and-Shill: A Reinforcement Learning Framework for Market Manipulation Detection in Symphony-a Decentralized Multi-Agent System

Shi, Ronghua, Liu, Yiou, Ying, Xinyu, Tan, Yang, Feng, Yuchun, Ai, Lynn, Shi, Bill, Wang, Xuhui, Liu, Zhuang

arXiv.org Artificial Intelligence

Decentralized finance (DeFi) has ushered in a new era of permissionless financial innovation--but also opened the door to discourse-driven market manipulation at unprecedented scale. Without centralized gatekeepers or regulatory oversight, malicious actors now coordinate shilling campaigns and pump-and-dump schemes across social platforms and on-chain ecosystems. We propose Hide-and-Shill, a novel Multi-Agent Reinforcement Learning (MARL) framework for decentralized manipulation detection. By modeling the interaction between manipulators and detectors as a dynamic adversarial game, the framework learns to identify suspicious discourse patterns using delayed token price reactions as ground-truth financial signals. Our method introduces three key innovations: (1) Group Relative Policy Optimization (GRPO) to improve learning stability in sparse-reward and partially observable settings; (2) a theory-grounded reward function inspired by rational expectations and information asymmetry, distinguishing price discovery from manipulation-induced noise; and (3) a multi-modal agent pipeline that fuses LLM-based semantic features, social graph signals, and on-chain market data for informed decision-making. To support scalable and trustless deployment, our framework is integrated within the Symphony system--a decentralized multi-agent coordination architecture that enables peer-to-peer agent execution, trust-aware learning through distributed logs, and chain-verifiable evaluation. Symphony facilitates adversarial co-evolution among strategic actors and maintains robust manipulation detection without reliance on centralized oracles, empowering real-time surveillance across global DeFi discourse ecosystems. Trained on 100,000 real-world discourse episodes and validated in adversarial co-evolution simulations, Hide-and-Shill achieves state-of-the-art performance in both detection accuracy and causal attribution.
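The GRPO component named in innovation (1) is usually characterized by replacing a learned value baseline with a group-relative advantage: rewards within a group of sampled trajectories are normalized by the group's own mean and standard deviation. A minimal sketch of that normalization (the epsilon and the exact normalization details are common conventions, not taken from this paper):

```python
import numpy as np

def group_relative_advantages(rewards):
    """GRPO-style advantages: standardize each trajectory's reward
    against its sampling group, avoiding a learned critic in
    sparse-reward, partially observable settings."""
    r = np.asarray(rewards, dtype=float)
    return (r - r.mean()) / (r.std() + 1e-8)

# Four detector rollouts on the same discourse episode; rewards come
# from delayed token-price reactions (toy values)
adv = group_relative_advantages([0.0, 1.0, 0.0, 1.0])
```

Trajectories that beat their group average get positive advantage and are reinforced; the normalization keeps gradient scales stable even when most episodes yield zero reward.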


Predicting person-level injury severity using crash narratives: A balanced approach with roadway classification and natural language process techniques

Majidi, Mohammad Zana, Karimi, Sajjad, Wang, Teng, Kluger, Robert, Souleyrette, Reginald

arXiv.org Artificial Intelligence

Predicting injuries and fatalities in traffic crashes plays a critical role in enhancing road safety, improving emergency response, and guiding public health interventions. This study investigates the added value of unstructured crash narratives (written by police officers at the scene) when combined with structured crash data to predict injury severity. Two widely used Natural Language Processing (NLP) techniques, Term Frequency-Inverse Document Frequency (TF-IDF) and Word2Vec, were employed to extract semantic meaning from the narratives, and their effectiveness was compared. To address the challenge of class imbalance, a K-Nearest Neighbors-based oversampling method was applied to the training data prior to modeling. The dataset consists of crash records from Kentucky spanning 2019 to 2023. To account for roadway heterogeneity, three road classification schemes were used: (1) eight detailed functional classes (e.g., Urban Two-Lane, Rural Interstate, Urban Multilane Divided), (2) four broader paired categories (e.g., Urban vs. Rural, Freeway vs. Non-Freeway), and (3) a unified dataset without classification. A total of 102 machine learning models were developed by combining structured features and narrative-based features using the two NLP techniques alongside three ensemble algorithms: XGBoost, Random Forest, and AdaBoost. Results demonstrate that models incorporating narrative data consistently outperform those relying solely on structured data. Among all combinations, TF-IDF coupled with XGBoost yielded the most accurate predictions in most subgroups. The findings highlight the power of integrating textual and structured crash information to enhance person-level injury prediction. This work offers a practical and adaptable framework for transportation safety professionals to improve crash severity modeling, guide policy decisions, and design more effective countermeasures.
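TF-IDF, one of the two narrative-featurization techniques compared above, can be written out in a few lines. This pure-Python sketch uses the common smoothed-IDF convention; tokenization, vocabulary size, and weighting details in the actual study may differ.

```python
import math
from collections import Counter

def tfidf(docs):
    """Minimal TF-IDF over pre-tokenized documents: term frequency
    weighted by smoothed inverse document frequency."""
    n = len(docs)
    df = Counter(t for d in docs for t in set(d))      # document frequency
    vocab = sorted(df)
    idf = {t: math.log((1 + n) / (1 + df[t])) + 1 for t in vocab}
    vecs = []
    for d in docs:
        tf = Counter(d)
        vecs.append([tf[t] / len(d) * idf[t] for t in vocab])
    return vocab, vecs

# Toy crash-narrative tokens (hypothetical examples)
docs = [["driver", "ejected", "rollover"],
        ["rear", "end", "collision", "driver"]]
vocab, vecs = tfidf(docs)
```

The resulting vectors, concatenated with the structured crash features, would feed the ensemble classifiers (XGBoost, Random Forest, AdaBoost); terms appearing in every narrative, like "driver" here, get the minimum IDF weight, while rarer, more discriminative terms are up-weighted.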


Generative Artificial Intelligence and Agents in Research and Teaching

Jauhiainen, Jussi S., Toppari, Aurora

arXiv.org Artificial Intelligence

This study provides a comprehensive analysis of the development, functioning, and application of generative artificial intelligence (GenAI) and large language models (LLMs), with an emphasis on their implications for research and education. It traces the conceptual evolution from artificial intelligence (AI) through machine learning (ML) and deep learning (DL) to transformer architectures, which constitute the foundation of contemporary generative systems. Technical aspects, including prompting strategies, word embeddings, and probabilistic sampling methods (temperature, top-k, and top-p), are examined alongside the emergence of autonomous agents. These elements are considered in relation to both the opportunities they create and the limitations and risks they entail. The work critically evaluates the integration of GenAI across the research process, from ideation and literature review to research design, data collection, analysis, interpretation, and dissemination. While particular attention is given to geographical research, the discussion extends to wider academic contexts. A parallel strand addresses the pedagogical applications of GenAI, encompassing course and lesson design, teaching delivery, assessment, and feedback, with geography education serving as a case example. Central to the analysis are the ethical, social, and environmental challenges posed by GenAI. Issues of bias, intellectual property, governance, and accountability are assessed, alongside the ecological footprint of LLMs and emerging technological strategies for mitigation. The concluding section considers near- and long-term futures of GenAI, including scenarios of sustained adoption, regulation, and potential decline. By situating GenAI within both scholarly practice and educational contexts, the study contributes to critical debates on its transformative potential and societal responsibilities.
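The probabilistic sampling methods the study examines, temperature, top-k, and top-p, compose in a fixed order: scale the logits, truncate the distribution, renormalize, then draw. A compact sketch of that pipeline (parameter names and defaults are generic conventions, not tied to any particular LLM API):

```python
import numpy as np

def sample_token(logits, temperature=1.0, top_k=0, top_p=1.0, rng=None):
    """Temperature scaling, then optional top-k and top-p (nucleus)
    truncation, then one draw from the renormalized distribution."""
    rng = np.random.default_rng(rng)
    z = np.asarray(logits, dtype=float) / temperature   # temperature scaling
    probs = np.exp(z - z.max())
    probs /= probs.sum()
    if top_k > 0:                       # keep only the k likeliest tokens
        cutoff = np.sort(probs)[-top_k]
        probs = np.where(probs >= cutoff, probs, 0.0)
    if top_p < 1.0:                     # smallest set covering top_p mass
        order = np.argsort(probs)[::-1]
        csum = np.cumsum(probs[order])
        keep = order[: np.searchsorted(csum, top_p) + 1]
        mask = np.zeros_like(probs)
        mask[keep] = 1.0
        probs *= mask
    probs /= probs.sum()
    return rng.choice(len(probs), p=probs)

tok = sample_token([2.0, 1.0, 0.1, -1.0], temperature=0.7, top_k=2)
```

Lower temperature sharpens the distribution toward the highest logit; top-k and top-p then bound how far into the tail sampling can reach, which is why the same prompt can yield anything from deterministic to highly varied text depending on these settings.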